Block Based Fetch Engine for Superscalar Processors
Authors
Abstract
The implementation of modern high-performance computers is increasingly directed toward hardware parallelism. However, most current fetch units are limited to one branch prediction per cycle and can therefore fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue a small number of instructions per cycle, it is not sufficient for processors with higher peak issue rates. In this paper we propose a new architecture that combines an enhanced branch target buffer with a block cache to fetch multiple blocks. The enhanced branch target buffer extends the traditional branch target buffer so that it can store multiple branch targets. The block cache keeps the instruction streams collected at the commit stage as blocks. These blocks are renamed, and the renamed index is stored in the enhanced branch target buffer as the basic fetch unit. The proposed architecture has been simulated using the SimpleScalar 2.0 simulator. The results show that the average number of instructions fetched per cycle reaches 4.84, the average number of instructions issued per cycle is 3.44, and the average IPC is 2.07. Compared with the baseline machine, these are increases of 87%, 84%, and 56%, respectively.
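The fetch path described in the abstract can be pictured with a small software model. The sketch below is only an illustration under stated assumptions, not the authors' design: the class names (`Block`, `EnhancedBTBEntry`, `FetchEngine`), the `blocks_per_cycle` parameter, the sequential renaming scheme, and the training rule are all hypothetical. It shows blocks being recorded at commit, given a renamed index, and then fetched several at a time by following the successor indices held in one enhanced BTB entry.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Block:
    """A committed instruction stream stored as one block (hypothetical layout)."""
    start_pc: int
    instructions: list[str]


@dataclass
class EnhancedBTBEntry:
    """Holds several successor block indices rather than a single branch target."""
    next_blocks: list[int] = field(default_factory=list)


class FetchEngine:
    """Toy model: block cache plus enhanced BTB, both keyed by renamed index."""

    def __init__(self, blocks_per_cycle: int = 2):
        self.blocks_per_cycle = blocks_per_cycle       # assumed fetch width in blocks
        self.block_cache: dict[int, Block] = {}        # renamed index -> block
        self.index_of_pc: dict[int, int] = {}          # start PC -> renamed index
        self.btb: dict[int, EnhancedBTBEntry] = {}     # renamed index -> successors

    def commit_block(self, start_pc: int, instructions: list[str]) -> int:
        """Record a block at commit time and return its renamed index."""
        if start_pc not in self.index_of_pc:
            idx = len(self.block_cache)                # assumption: sequential renaming
            self.index_of_pc[start_pc] = idx
            self.block_cache[idx] = Block(start_pc, list(instructions))
            self.btb[idx] = EnhancedBTBEntry()
        return self.index_of_pc[start_pc]

    def train(self, idx: int, successors: list[int]) -> None:
        """Store the successor indices observed after block idx."""
        self.btb[idx].next_blocks = list(successors)[: self.blocks_per_cycle - 1]

    def fetch(self, idx: int) -> list[str]:
        """Return the instructions of up to blocks_per_cycle blocks in one 'cycle'."""
        chain = [idx] + self.btb[idx].next_blocks
        fetched: list[str] = []
        for i in chain[: self.blocks_per_cycle]:
            fetched.extend(self.block_cache[i].instructions)
        return fetched


engine = FetchEngine(blocks_per_cycle=2)
a = engine.commit_block(0x100, ["add", "cmp", "beq"])
b = engine.commit_block(0x200, ["ld", "sub", "jmp"])
engine.train(a, [b])
print(engine.fetch(a))  # ['add', 'cmp', 'beq', 'ld', 'sub', 'jmp']
```

In this model a fetch that starts at block `a` delivers two blocks in one step, whereas a fetch that starts at the untrained block `b` delivers only one, which mirrors the single-basic-block limit the abstract attributes to conventional fetch units.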
Similar resources
Block-Level Prediction for Wide-Issue Superscalar Processors
Changes in control flow, caused primarily by conditional branches, are a prime impediment to the performance of wide-issue superscalar processors. This paper investigates a block-level prediction scheme to mitigate the effects of control flow changes caused by conditional branches. Instead of predicting the outcome of each conditional branch individually, this scheme predicts the target of a sequent...
Scalable Hardware Mechanisms for Superscalar Processors
Abstract of the Dissertation. Scalable Hardware Mechanisms for Superscalar Processors, by Steven Daniel Wallace. Doctor of Philosophy in Electrical and Computer Engineering, University of California, Irvine, 1997. Professor Nader Bagherzadeh, Chair. Superscalar processors fetch and execute multiple instructions per cycle. As more instructions can be executed per cycle, an accurate and high bandwidth instructi...
An Exploration Of Instruction Fetch Requirement In Out-of-order Superscalar Processors
Automated design of superscalar processors can provide future in terms a cycles-per-instruction (CPI) using the application program statistics and the 124, Optimization of Instruction Fetch Mechanisms for High Issue Rates 117, A first-order superscalar processor model Karkhanis, Smith 2004. Because superscalar architectures include complicated control logic for out-of-order execu...
The Basic Block Reassembling Instruction Stream Buffer with LWBTB for X86 ISA
The potential performance of superscalar processors can be exploited only when the processor is fed with sufficient instruction bandwidth. The front-end units, the Instruction Stream Buffer (ISB) and the fetcher, are the key elements for achieving this goal. Current ISBs cannot support instruction streaming beyond a basic block. In x86 processors, the split-line instruction problem worsens this ...
University Wednesday, 10 May 2000 Trace Cache
Due to unfortunate circumstances, this lecture was not scribed; the following are several points that I remember being brought up. If anyone has something to add, please tell me. In this session we discussed three papers: Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism, which describes several enhancements to the original University of Michigan view of the trace cache. Path-Based Nex...